217 research outputs found

    Powerful sequence similarity search methods and in-depth manual analyses can identify remote homologs in many apparently "orphan" viral proteins.

    The genome sequences of new viruses often contain many "orphan" or "taxon-specific" proteins apparently lacking homologs. However, because viral proteins evolve very fast, commonly used sequence similarity detection methods such as BLAST may overlook homologs. We analyzed a data set of proteins from RNA viruses characterized as "genus specific" by BLAST. More powerful methods developed recently, such as HHblits or HHpred (available through web-based, user-friendly interfaces), could detect distant homologs of a quarter of these proteins, suggesting that these methods should be used to annotate viral genomes. In-depth manual analyses of a subset of the remaining sequences, guided by contextual information such as taxonomy, gene order, or domain cooccurrence, identified distant homologs of another third. Thus, a combination of powerful automated methods and manual analyses can uncover distant homologs of many proteins thought to be orphans. We expect these methodological results also to be applicable to cellular organisms, since they generally evolve much more slowly than RNA viruses. As an application, we reanalyzed the genome of a bee pathogen, Chronic bee paralysis virus (CBPV). We could identify homologs of most of its proteins thought to be orphans; in each case, identifying homologs provided functional clues. We discovered that CBPV encodes a domain homologous to the Alphavirus methyltransferase-guanylyltransferase; a putative membrane protein, SP24, with homologs in unrelated insect viruses and insect-transmitted plant viruses having different morphologies (cileviruses, higreviruses, blunerviruses, negeviruses); and a putative virion glycoprotein, ORF2, also found in negeviruses. SP24 and ORF2 are probably major structural components of the virions.

    ProRepeat: an integrated repository for studying amino acid tandem repeats in proteins

    ProRepeat (http://prorepeat.bioinformatics.nl/) is an integrated curated repository and analysis platform for in-depth research on the biological characteristics of amino acid tandem repeats. ProRepeat collects repeats from all proteins included in the UniProt knowledgebase, together with 85 completely sequenced eukaryotic proteomes contained within the RefSeq collection. It contains non-redundant perfect tandem repeats, approximate tandem repeats and simple, low-complexity sequences, covering the majority of the amino acid tandem repeat patterns found in proteins. The ProRepeat web interface allows querying the repeat database using repeat characteristics like repeat unit and length, number of repetitions of the repeat unit and position of the repeat in the protein. Users can also search for repeats by the characteristics of repeat-containing proteins, such as entry ID, protein description, sequence length, gene name and taxon. ProRepeat offers powerful analysis tools for finding biologically interesting properties of repeats, such as the strong position bias of leucine repeats in the N-terminus of eukaryotic protein sequences, the differences of repeat abundance among proteomes, the functional classification of repeat-containing proteins and the GC content constraints on the repeats’ corresponding codons.
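
    As a rough illustration of what a perfect tandem repeat is, the Python sketch below scans a protein sequence with a backreference regular expression. It is not ProRepeat's own detection method, which the abstract does not describe; the function name `find_perfect_repeats`, the parameters `max_unit` and `min_copies`, and the example sequence are all hypothetical.

```python
import re


def find_perfect_repeats(seq, max_unit=3, min_copies=2):
    """Return (start, unit, copies) for each perfect tandem repeat in seq.

    Hypothetical helper for illustration only; not ProRepeat's algorithm.
    """
    # (.{1,max_unit}?) lazily captures a short repeat unit; \1{n,} then
    # requires at least (min_copies - 1) further exact copies right after it.
    pattern = re.compile(r"(.{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


example = "MAQQQQLLPAPAPAG"  # made-up sequence, not taken from any database
repeats = find_perfect_repeats(example)
```

    Because the unit is matched lazily, a run such as QQQQ is reported as four copies of Q rather than two copies of QQ. Approximate repeats and low-complexity regions, which the repository also stores, would need a different approach.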

    Fluctuating selection models and McDonald-Kreitman type analyses

    It is likely that the strength of selection acting upon a mutation varies through time due to changes in the environment. However, most population genetic theory assumes that the strength of selection remains constant. Here we investigate the consequences of fluctuating selection pressures on the quantification of adaptive evolution using McDonald-Kreitman (MK) style approaches. In agreement with previous work, we show that fluctuating selection can generate evidence of adaptive evolution even when the expected strength of selection on a mutation is zero. However, we also find that the mutations that contribute to both polymorphism and divergence tend, on average, to be positively selected during their lifetime under fluctuating selection models. This is because mutations that fluctuate, by chance, to positively selected values tend to reach higher frequencies in the population than those that fluctuate towards negative values. Hence, the evidence of positive adaptive evolution detected under a fluctuating selection model by MK type approaches is genuine, since fixed mutations tend to be advantageous on average during their lifetime. Nevertheless, we show that these methods tend to underestimate the rate of adaptive evolution when selection fluctuates.
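
    For readers unfamiliar with MK-style quantification, the Python sketch below computes one widely used estimator of the proportion of adaptive substitutions, alpha = 1 - (Ds * Pn) / (Dn * Ps), from counts of nonsynonymous and synonymous divergence (Dn, Ds) and polymorphism (Pn, Ps). The abstract does not say which estimator the authors used, so this is an assumption for illustration; the function name `mk_alpha` and the counts are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Estimate alpha from McDonald-Kreitman counts.

    dn, ds: nonsynonymous / synonymous fixed differences between species.
    pn, ps: nonsynonymous / synonymous polymorphisms within a species.
    Illustrative only; assumes the simple ratio-based estimator.
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero")
    return 1.0 - (ds * pn) / (dn * ps)


# Made-up counts: an excess of nonsynonymous divergence gives alpha > 0.
alpha = mk_alpha(dn=10, ds=20, pn=5, ps=20)
```

    When Dn/Ds equals Pn/Ps, the estimate is zero, which is the neutral expectation that this kind of analysis tests against.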

    Stops making sense: translational trade-offs and stop codon reassignment

    Background: Efficient gene expression involves a trade-off between (i) premature termination of protein synthesis; and (ii) readthrough, where the ribosome fails to dissociate at the terminal stop. Sense codons that are similar in sequence to stop codons are more susceptible to nonsense mutation, and are also likely to be more susceptible to transcriptional or translational errors causing premature termination. We therefore expect this trade-off to be influenced by the number of stop codons in the genetic code. Although genetic codes are highly constrained, stop codon number appears to be their most volatile feature. Results: In the human genome, codons readily mutable to stops are underrepresented in coding sequences. We construct a simple mathematical model based on the relative likelihoods of premature termination and readthrough. When readthrough occurs, the resultant protein has a tail of amino acid residues incorrectly added to the C-terminus. Our results depend strongly on the number of stop codons in the genetic code. When the code has more stop codons, premature termination is relatively more likely, particularly for longer genes. When the code has fewer stop codons, the length of the tail added by readthrough will, on average, be longer, and thus more deleterious. Comparative analysis of taxa with a range of stop codon numbers suggests that genomes whose code includes more stop codons have shorter coding sequences. Conclusions: We suggest that the differing trade-offs presented by alternative genetic codes may result in differences in genome structure. More speculatively, multiple stop codons may mitigate readthrough, counteracting the disadvantage of a higher rate of nonsense mutation. This could help explain the puzzling overrepresentation of stop codons in the canonical genetic code and most variants.
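
    The idea of codons "readily mutable to stops" can be made concrete with a small Python sketch that lists the sense codons a single nucleotide substitution turns into a stop, for a given set of stop codons. This is not the authors' mathematical model, only an illustration of its premise; the function name `stop_adjacent_codons` is hypothetical, and the two-stop case simply assumes a code in which TGA has been reassigned to a sense codon.

```python
from itertools import product

BASES = "ACGT"


def stop_adjacent_codons(stops):
    """Return the sense codons that one substitution converts into a stop."""
    stops = set(stops)
    adjacent = set()
    for codon in ("".join(triplet) for triplet in product(BASES, repeat=3)):
        if codon in stops:
            continue  # only sense codons are of interest
        for pos in range(3):
            for base in BASES:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if mutant in stops:
                    adjacent.add(codon)
    return adjacent


# Canonical code: three stop codons.
standard = stop_adjacent_codons({"TAA", "TAG", "TGA"})
# Assumed variant code in which TGA is read as a sense codon.
two_stops = stop_adjacent_codons({"TAA", "TAG"})
```

    With three stops, 18 sense codons sit one substitution away from a stop; with TGA reassigned, the count drops to 14. That difference illustrates why the number of stop codons changes exposure to premature termination.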

    X-Ray Standing-Wave Investigations of Valence Electronic Structure

    We have examined the valence-electron emission from Cu, Ge, GaAs, InP, and NiO single crystals under the condition of strong x-ray Bragg reflection; i.e., in the presence of the spatially modulated x-ray standing-wave interference field that is produced by the superposition of the incident and reflected x-ray beams. These crystals span the entire metallic, covalent, and ionic range of solid-state bonding. It is demonstrated that the valence-electron emission is closely coupled to the atomic cores, even for electron states close to a metallic Fermi edge. Using the bond-orbital approximation, the x-ray standing-wave structure factor for valence-electron emission is derived in terms of the bond polarities and photoionization cross sections of the atoms within the crystalline unit cell and compared to experiment. Additionally, we demonstrate that by exploiting the spatial dependence of the electric-field intensity under the Bragg condition, site-specific valence electronic structure may be obtained. The technique is demonstrated for GaAs and NiO.

    Orthoparamyxovirinae C Proteins Have a Common Origin and a Common Structural Organization

    The protein C is a small viral protein encoded in an overlapping frame of the P gene in the subfamily Orthoparamyxovirinae. This protein, expressed by alternative translation initiation, is a virulence factor that regulates viral transcription, replication, and production of defective interfering RNA, interferes with the host-cell innate immunity systems, and supports the assembly of viral particles and budding. We expressed and purified full-length and N-terminally truncated forms of the C protein from Tupaia paramyxovirus (TupV; genus Narmovirus). We solved the crystal structure of the C-terminal part of TupV C protein at a resolution of 2.4 Å and found that it is structurally similar to Sendai virus (SeV) C protein, suggesting that despite undetectable sequence conservation, these proteins are homologous. We characterized both truncated and full-length proteins by SEC-MALLS and SEC-SAXS and described their solution structures by ensemble models. We established a mini-replicon assay for the related Nipah virus (NiV) and showed that TupV C inhibited the expression of NiV minigenome in a concentration-dependent manner as efficiently as the NiV C protein. A previous study found that the Orthoparamyxovirinae C proteins form two clusters without detectable sequence similarity, raising the question of whether they were homologous or instead had originated independently. Since TupV C and SeV C are representatives of these two clusters, our discovery that they have a similar structure indicates that all Orthoparamyxovirinae C proteins are homologous. Our results also imply that, strikingly, a STAT1-binding site is encoded by exactly the same RNA region of the P/C gene across Paramyxovirinae, but in different reading frames (P or C), depending on which cluster they belong to.
    Funding/support: French Agence Nationale de la Recherche; Fond de la Recherche Médicale (FRM); Grenoble Instruct-ERIC center; FRISBI; University Grenoble Alpes graduate school (Ecoles Universitaires de Recherche). Peer reviewed.

    Creative Research Science Experiences for High School Students

    A French research institute raises the bar for public outreach with an educational laboratory that engages 1,000 high school students per year in mini research projects.

    Impact of target site distribution for Type I restriction enzymes on the evolution of methicillin-resistant Staphylococcus aureus (MRSA) populations.

    A limited number of methicillin-resistant Staphylococcus aureus (MRSA) clones are responsible for MRSA infections worldwide, and those of different lineages carry unique Type I restriction-modification (RM) variants. We have identified the specific DNA sequence targets for the dominant MRSA lineages CC1, CC5, CC8 and ST239. We experimentally demonstrate that this RM system is sufficient to block horizontal gene transfer between clinically important MRSA, confirming the bioinformatic evidence that each lineage is evolving independently. Target sites are distributed randomly in S. aureus genomes, except in a set of large conjugative plasmids encoding resistance genes that show evidence of spreading between two successful MRSA lineages. This analysis of the identification and distribution of target sites explains evolutionary patterns in a pathogenic bacterium. We show that a lack of specific target sites enables plasmids to evade the Type I RM system, thereby contributing to the evolution of increasingly resistant community and hospital MRSA.

    Ready ... Go: Amplitude of the fMRI Signal Encodes Expectation of Cue Arrival Time

    What happens when the brain awaits a signal of uncertain arrival time, as when a sprinter waits for the starting pistol? And what happens just after the starting pistol fires? Using functional magnetic resonance imaging (fMRI), we have discovered a novel correlate of temporal expectations in several brain regions, most prominently in the supplementary motor area (SMA). Contrary to expectations, we found little fMRI activity during the waiting period; however, a large signal appears after the “go” signal, the amplitude of which reflects learned expectations about the distribution of possible waiting times. Specifically, the amplitude of the fMRI signal appears to encode a cumulative conditional probability, also known as the cumulative hazard function. The fMRI signal loses its dependence on waiting time in a “countdown” condition in which the arrival time of the go cue is known in advance, suggesting that the signal encodes temporal probabilities rather than simply elapsed time. The dependence of the signal on temporal expectation is present in “no-go” conditions, demonstrating that the effect is not a consequence of motor output. Finally, the encoding is not dependent on modality, operating in the same manner with auditory or visual signals. This finding extends our understanding of the relationship between temporal expectancy and measurable neural signals.
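
    To show what a hazard function expresses, the Python sketch below takes a discrete distribution of waiting times and computes, for each time step, the probability that the cue arrives now given that it has not yet arrived, together with the running sum of those values. The abstract does not give the exact formula used in the study, so this follows one conventional discrete-time definition as an assumption; the function names and the uniform four-step distribution are hypothetical.

```python
def discrete_hazard(probs):
    """Hazard per time step: P(cue arrives now | it has not arrived yet)."""
    hazards = []
    remaining = sum(probs)  # probability that the cue is still to come
    for p in probs:
        hazards.append(p / remaining if remaining > 0 else 0.0)
        remaining -= p
    return hazards


def cumulative_hazard(probs):
    """Running sum of the per-step hazards (assumed discrete definition)."""
    total = 0.0
    out = []
    for h in discrete_hazard(probs):
        total += h
        out.append(total)
    return out


# Made-up example: the cue is equally likely at each of four time steps.
waits = [0.25, 0.25, 0.25, 0.25]
hazard = discrete_hazard(waits)
cumulative = cumulative_hazard(waits)
```

    Even with a flat distribution of arrival times, the hazard rises as the wait goes on, because each step that passes without a cue makes the remaining steps more likely.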